
    Complex Event Processing as a Service in Multi-Cloud Environments

    The rise of mobile technologies and the Internet of Things, combined with advances in Web technologies, has created a new Big Data world in which the volume and velocity of data generation have reached an unprecedented scale. As a technology created to process continuous streams of data, Complex Event Processing (CEP) has often been related to Big Data and used as a tool to obtain real-time insights. However, despite this recent surge of interest, the CEP market is still dominated by solutions that are either costly and inflexible or too low-level and hard to operate. To address these problems, this research proposes the creation of a CEP system that can be offered as a service and used over the Internet. Such a CEP as a Service (CEPaaS) system would give its users CEP functionalities together with the advantages of the services model, such as no up-front investment and low maintenance cost. Nevertheless, creating such a service involves challenges that are not addressed by current CEP systems. This research proposes solutions for three open problems that exist in this context. First, to address the problem of understanding and reusing existing CEP management procedures, this research introduces the Attributed Graph Rewriting for Complex Event Processing Management (AGeCEP) formalism as a technology- and language-agnostic representation of queries and their reconfigurations. Second, to address the problem of evaluating CEP query management and processing strategies, this research introduces CEPSim, a simulator of cloud-based CEP systems. Finally, this research also introduces a CEPaaS system based on a multi-cloud architecture, container management systems, and an AGeCEP-based multi-tenant design. To demonstrate its feasibility, AGeCEP was used to design an autonomic manager and a selected set of self-management policies. Moreover, CEPSim was thoroughly evaluated by experiments that showed it can simulate existing systems with accuracy and low execution overhead. Finally, additional experiments validated the CEPaaS system and demonstrated that it achieves the goal of offering CEP functionalities as a scalable and fault-tolerant service. Together, these results confirm that this research significantly advances the CEP state of the art and provides novel tools and methodologies that can be applied to CEP research.

    Complex Event Processing as a Service in Multi-Cloud Environments (Processamento de eventos complexos como serviço em ambientes multi-nuvem)

    Advisors: Luiz Fernando Bittencourt, Miriam Akemi Manabe Capretz. Thesis (doctorate), Universidade Estadual de Campinas, Instituto de Computação. Abstract: The emergence of mobile device technologies and the Internet of Things, combined with advances in Web technologies, has created a new Big Data world in which the volume and velocity of data generation have reached an unprecedented scale. As a technology created to process continuous streams of data, Complex Event Processing (CEP) has often been associated with Big Data and applied as a tool to obtain real-time information. However, despite this wave of interest, the CEP market is still dominated by proprietary solutions that require large acquisition investments and do not provide the flexibility that users need. As an alternative, some companies adopt low-level solutions that demand intensive technical training and have high operational costs. To solve these problems, this research proposes the creation of a CEP system that can be offered as a service and used over the Internet. A CEP as a Service (CEPaaS) system offers users CEP functionalities combined with the advantages of the services model, such as reduced up-front investment and low maintenance cost. Nevertheless, creating such a service involves numerous challenges that are not addressed by the current CEP state of the art. In particular, this research proposes solutions for three open problems that exist in this context. First, for the problem of understanding and reusing the enormous variety of CEP system management procedures, this research proposes the Attributed Graph Rewriting for Complex Event Processing Management (AGeCEP) formalism. This formalism includes models for CEP queries and query transformations that are independent of technology and language. Second, for the problem of evaluating CEP query management and processing strategies, this research presents CEPSim, a simulator of cloud-based CEP systems. Finally, this research also describes a CEPaaS system based on multi-cloud environments, container management systems, and an AGeCEP-based multi-tenant design. To demonstrate its feasibility, the AGeCEP formalism was used to design an autonomic manager and a set of self-management policies for CEP systems. In addition, the CEPSim simulator was thoroughly evaluated through experiments that demonstrate its ability to simulate CEP systems with accuracy and low processing overhead. Finally, additional experiments validated the CEPaaS system and demonstrated that the goal of offering CEP functionalities as a scalable and fault-tolerant service was achieved. Together, these results confirm that this research significantly advances the state of the art and also offers new tools and methodologies that can be applied to CEP research. Doctorate. Computer Science. Doctor in Computer Science. 140920/2012-9. CNP

    CEPSim: Modelling and Simulation of Complex Event Processing Systems in Cloud Environments

    The emergence of Big Data has had a profound impact on how data are stored and processed. As technologies created to process continuous streams of data with low latency, Complex Event Processing (CEP) and Stream Processing (SP) have often been related to the Big Data velocity dimension and used in this context. Many modern CEP and SP systems leverage cloud environments to provide the low latency and scalability required by Big Data applications, yet validating these systems at the required scale is a research problem in itself. Cloud computing simulators have been used as a tool to facilitate reproducible and repeatable experiments in clouds. Nevertheless, existing simulators are mostly based on simple application and simulation models that are not appropriate for CEP or SP. This article presents CEPSim, a simulator for CEP and SP systems in cloud environments. CEPSim proposes a query model based on Directed Acyclic Graphs (DAGs) and introduces a simulation algorithm based on a novel abstraction called event sets. CEPSim is highly customizable and can be used to analyze the performance and scalability of user-defined queries and to evaluate the effects of various query processing strategies. Experimental results show that CEPSim can simulate existing systems in large Big Data scenarios with accuracy and precision.
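To make the DAG-based query model more concrete, the following is a minimal Python sketch. It is not CEPSim's API, and it does not reproduce the event-set abstraction. It only assumes that a query is a DAG of operators, that each operator has a hypothetical `selectivity` (output events per input event), and that every operator forwards its full output to each successor. All function and parameter names are illustrative.

```python
from collections import deque


def topological_order(vertices, edges):
    """Order operators so that each one appears after all its predecessors.

    Uses Kahn's algorithm and raises ValueError if the graph has a cycle.
    """
    indegree = {v: 0 for v in vertices}
    for successors in edges.values():
        for dst in successors:
            indegree[dst] += 1
    ready = deque(v for v in vertices if indegree[v] == 0)
    order = []
    while ready:
        v = ready.popleft()
        order.append(v)
        for dst in edges.get(v, []):
            indegree[dst] -= 1
            if indegree[dst] == 0:
                ready.append(dst)
    if len(order) != len(vertices):
        raise ValueError("query graph contains a cycle")
    return order


def propagate_event_counts(vertices, edges, produced):
    """Estimate how many events each operator emits in one interval.

    vertices: operator name -> selectivity (assumed, illustrative attribute)
    edges:    operator name -> list of successor operator names
    produced: source name -> events generated by that source in the interval
    """
    incoming = {v: 0.0 for v in vertices}
    outgoing = {}
    for v in topological_order(vertices, edges):
        total_in = incoming[v] + produced.get(v, 0.0)
        outgoing[v] = total_in * vertices[v]
        for dst in edges.get(v, []):
            incoming[dst] += outgoing[v]
    return outgoing


# Hypothetical three-operator query: source -> filter -> sink.
query_vertices = {"source": 1.0, "filter": 0.5, "sink": 1.0}
query_edges = {"source": ["filter"], "filter": ["sink"]}
counts = propagate_event_counts(query_vertices, query_edges, {"source": 1000.0})
```

Processing operators in topological order means each operator sees the complete output of its predecessors before it runs, which is the property that makes an acyclic query graph convenient to simulate step by step.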

    Evaluation of Particle Swarm Optimization Applied to Grid Scheduling

    The problem of scheduling independent users’ jobs to resources in Grid Computing systems is of paramount importance. This problem is known to be NP-hard, and many techniques have been proposed to solve it, such as heuristics, genetic algorithms (GA), and, more recently, particle swarm optimization (PSO). This article uses PSO to solve grid scheduling problems and compares it with other techniques. It is shown that many often-overlooked implementation details can have a large impact on the performance of the method. In addition, experiments show that PSO tends to stagnate around local minima in high-dimensional input problems. Therefore, this work also proposes a novel hybrid PSO-GA method that aims to increase swarm diversity when a stagnation condition is detected. The method is evaluated and compared with other PSO formulations; the results show that the new method can successfully improve the scheduling solution.
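As an illustration of how PSO can be applied to this kind of scheduling problem, here is a minimal Python sketch of a plain global-best PSO that minimizes makespan. It is not the article's formulation and does not include the hybrid PSO-GA step. The position encoding (one continuous coordinate per job, truncated to a resource index), the `etc` matrix of expected completion times, and all parameter values are assumptions chosen for the example.

```python
import random


def makespan(assignment, etc):
    """Completion time of the busiest resource for a job-to-resource assignment."""
    loads = [0.0] * len(etc[0])
    for job, resource in enumerate(assignment):
        loads[resource] += etc[job][resource]
    return max(loads)


def decode(position, n_resources):
    """Map a continuous particle position to one resource index per job."""
    return [min(n_resources - 1, max(0, int(x))) for x in position]


def pso_schedule(etc, n_particles=20, iterations=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO; etc[j][r] is the assumed run time of job j on resource r."""
    rng = random.Random(seed)
    n_jobs, n_resources = len(etc), len(etc[0])
    positions = [[rng.uniform(0, n_resources) for _ in range(n_jobs)]
                 for _ in range(n_particles)]
    velocities = [[0.0] * n_jobs for _ in range(n_particles)]
    personal_best = [p[:] for p in positions]
    personal_cost = [makespan(decode(p, n_resources), etc) for p in positions]
    best = min(range(n_particles), key=personal_cost.__getitem__)
    global_best, global_cost = personal_best[best][:], personal_cost[best]

    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(n_jobs):
                r1, r2 = rng.random(), rng.random()
                # Standard velocity update: inertia + cognitive + social terms.
                velocities[i][d] = (w * velocities[i][d]
                                    + c1 * r1 * (personal_best[i][d] - positions[i][d])
                                    + c2 * r2 * (global_best[d] - positions[i][d]))
                # Keep the coordinate inside the valid resource range.
                moved = positions[i][d] + velocities[i][d]
                positions[i][d] = min(float(n_resources), max(0.0, moved))
            cost = makespan(decode(positions[i], n_resources), etc)
            if cost < personal_cost[i]:
                personal_best[i], personal_cost[i] = positions[i][:], cost
                if cost < global_cost:
                    global_best, global_cost = positions[i][:], cost
    return decode(global_best, n_resources), global_cost


# Hypothetical instance: 4 jobs, 2 resources.
example_etc = [[3.0, 5.0], [2.0, 4.0], [6.0, 1.0], [4.0, 4.0]]
assignment, cost = pso_schedule(example_etc)
```

Details such as how positions are decoded and clamped are the kind of implementation choices that can change results noticeably, so this sketch should be read as one possible set of choices rather than a reference.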

    Data management in cloud environments: NoSQL and NewSQL data stores

    Advances in Web technology and the proliferation of mobile devices and sensors connected to the Internet have resulted in immense processing and storage requirements. Cloud computing has emerged as a paradigm that promises to meet these requirements. This work focuses on the storage aspect of cloud computing, specifically on data management in cloud environments. Traditional relational databases were designed in a different hardware and software era and are facing challenges in meeting the performance and scale requirements of Big Data. NoSQL and NewSQL data stores present themselves as alternatives that can handle huge volumes of data. Because of the large number and diversity of existing NoSQL and NewSQL solutions, it is difficult to comprehend the domain and even more challenging to choose an appropriate solution for a specific task. Therefore, this paper reviews NoSQL and NewSQL solutions with the objective of: (1) providing a perspective on the field, (2) providing guidance to practitioners and researchers in choosing the appropriate data store, and (3) identifying challenges and opportunities in the field. Specifically, the most prominent solutions are compared, focusing on data models, querying, scaling, and security-related capabilities. Features driving the ability to scale read requests, write requests, and data storage are investigated, in particular partitioning, replication, consistency, and concurrency control. Furthermore, use cases and scenarios in which NoSQL and NewSQL data stores have been used are discussed, and the suitability of various solutions for different sets of applications is examined. Consequently, this study has identified challenges in the field, including the immense diversity and inconsistency of terminologies, limited documentation, sparse comparison and benchmarking criteria, and the nonexistence of standardized query languages.

    Service Evolution Patterns

    Service evolution is the process of maintaining and evolving existing Web services to cater for new requirements and technological changes. In this paper, a service evolution model is proposed to analyze service dependencies, identify changes to services, and estimate the impact on consumers that will use new versions of these services. Based on the proposed service evolution model, four service evolution patterns are described: compatibility, transition, split-map, and merge-map. These patterns provide reusable templates to encourage well-defined service evolution while minimizing the issues that would otherwise arise. They can be applied in the service evolution scenario where a single service is used by many, possibly unknown, consumers’ applications. In such a scenario, providers evolve their services independently from consumers, which might cause unexpected errors and have unpredicted impact on the dependent consumers' applications. Therefore, providers can use these patterns to estimate the impact that changes to their services may have on their consumers, and to allow consumers to migrate smoothly to the newest version of the service.

    Network and Energy-Aware Resource Selection Model for Opportunistic Grids

    Due to increasing hardware capacity, computing grids have been handling and processing more data. This has led to a higher amount of energy being consumed by grids; hence the need for strategies to reduce their energy consumption. Scheduling is the process that defines on which node each task will be executed in the grid. This process can significantly impact the global system performance, including energy consumption. This paper focuses on a scheduling model for opportunistic grids that considers network traffic, the distance between input files and the execution node, and the execution node status. The model was tested in a simulated environment created using GreenCloud. The simulation results show that this model achieves a total power consumption savings of 7.10% compared to a usual approach.

    Deep Neural Networks With Confidence Sampling For Electrical Anomaly Detection

    The increase in electrical metering has created tremendous quantities of data and, as a result, possibilities for deep insights into energy usage, better energy management, and new ways of conserving energy. As buildings are responsible for a significant portion of overall energy consumption, conservation efforts targeting buildings can have a tremendous effect on energy savings. Building energy monitoring enables identification of anomalous or unexpected behaviors which, when corrected, can lead to energy savings. Although the available data is large, the limited availability of labels makes anomaly detection difficult. This research proposes a deep semi-supervised convolutional neural network with confidence sampling for electrical anomaly detection. To achieve semi-supervised learning, two sub-networks are used: the first performs reconstruction and uses unlabelled data, while the second performs classification with labelled data. The two sub-networks overlap: the encoder parameters are shared between the two. To quantify anomaly detection confidence, a valuable metric in anomaly detection, the network uses a dropout sampling method. The proposed approach has been evaluated with real-world electrical data from systems such as HVAC, lighting, and heat pumps. The results demonstrated the accuracy of the proposed anomaly detection solution.

    A Gamification Framework for Sensor Data Analytics

    The Internet of Things (IoT) enables connected objects to capture, communicate, and collect information over the network through a multitude of sensors, setting the foundation for applications such as smart grids, smart cars, and smart cities. In this context, large-scale analytics is needed to extract knowledge and value from the data produced by these sensors. The ability to perform analytics on these data, however, is highly limited by the difficulties of collecting labels. Indeed, the machine learning techniques used to perform analytics rely upon data labels to learn and to validate results. Historically, crowdsourcing platforms have been used to gather labels, yet they cannot be directly used in the IoT because of the poor human readability of sensor data. To overcome these limitations, this paper proposes a framework for sensor data analytics which leverages the power of crowdsourcing through gamification to acquire sensor data labels. The framework uses gamification as a socially engaging vehicle and as a way to motivate users to participate in various labelling tasks. To demonstrate the proposed framework, a case study is also presented. Evaluation results show that the framework can successfully translate gamification events into sensor data labels.

    Energy Slices: Benchmarking with Time Slicing

    Benchmarking makes it possible to identify low-performing buildings, establishes a baseline for measuring performance improvements, enables setting of energy conservation targets, and encourages energy savings by creating a competitive environment. Statistical approaches evaluate building energy efficiency by comparing measured energy consumption to that of other similar buildings, typically using annual measurements. However, it is important to consider different time periods in benchmarking because of differences in their consumption patterns. For example, an office can be efficient during the night, but inefficient during operating hours due to occupants’ wasteful behavior. Moreover, benchmarking studies often use a single regression model for different building categories. Selecting the regression model based on actual data would ensure that the model fits the data well. Consequently, this paper proposes Energy Slices, an energy benchmarking approach with time slicing for existing buildings. Time slicing enables separation of time periods with different consumption patterns. The regression model suited for the specific scenario is selected using cross-validation, which ensures that the model performs well on previously unseen data. The evaluation is carried out on a case study involving two sports arenas; event energy efficiency is benchmarked to identify low-performing events. The case study demonstrates the Energy Slices procedure and shows the importance of model selection.
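The combination of time slicing and cross-validated model selection can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two candidate models (a constant model and a one-variable linear regression), the interleaved k-fold split, and the slice names and data are all hypothetical, since the abstract does not list the models or folds actually used.

```python
from statistics import mean


def fit_constant(xs, ys):
    """Baseline model: always predict the mean consumption."""
    average = mean(ys)
    return lambda x: average


def fit_linear(xs, ys):
    """Ordinary least squares with a single predictor."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return lambda x: y_bar + slope * (x - x_bar)


def cv_error(fit, xs, ys, k=3):
    """Mean squared prediction error over k interleaved folds."""
    errors = []
    for fold in range(k):
        train = [i for i in range(len(xs)) if i % k != fold]
        held_out = [i for i in range(len(xs)) if i % k == fold]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        errors.extend((model(xs[i]) - ys[i]) ** 2 for i in held_out)
    return mean(errors)


def select_model_per_slice(records, candidates, k=3):
    """Group (slice_name, x, y) records by time slice and pick, for each
    slice, the candidate with the lowest cross-validation error."""
    slices = {}
    for name, x, y in records:
        xs, ys = slices.setdefault(name, ([], []))
        xs.append(x)
        ys.append(y)
    return {name: min(candidates, key=lambda c: cv_error(candidates[c], xs, ys, k))
            for name, (xs, ys) in slices.items()}


# Hypothetical data: consumption grows with x during the day, but not at night.
records = [("day", x, 2 * x + 1) for x in range(9)]
records += [("night", x, y) for x, y in enumerate([5, 7, 5, 7, 5, 7])]
candidates = {"constant": fit_constant, "linear": fit_linear}
chosen = select_model_per_slice(records, candidates)
```

Because each slice is scored on held-out points, a more flexible model is chosen only where it actually predicts unseen data better, which is the motivation for selecting the model per slice rather than fixing one model for all periods.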